Detection of Chinese Word Usage Errors for Non-Native Chinese Learners with Bidirectional LSTM
Authors
Abstract
Selecting appropriate words to compose a sentence is a common problem faced by non-native Chinese learners. In this paper, we propose (bidirectional) LSTM sequence labeling models and explore various features to detect word usage errors in Chinese sentences. By combining CWINDOW word embedding features and POS information, the best bidirectional LSTM model achieves an accuracy of 0.5138 and an MRR of 0.6789 on the HSK dataset. For 80.79% of the test data, the model ranks the ground truth within the top two at position level.
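The three figures reported in the abstract (accuracy, MRR, and the share of test items whose ground truth is ranked in the top two) can all be derived from per-position error scores. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: it assumes each sentence has exactly one ground-truth error position, that the model emits one score per position, and that ties are broken in favor of the ground truth. The names `rank_of_truth` and `evaluate` and the toy scores are hypothetical.

```python
from typing import Dict, Sequence


def rank_of_truth(scores: Sequence[float], truth: int) -> int:
    """Return the 1-based rank of the ground-truth position.

    Positions are ranked by descending score. Assumption: ties are
    resolved in favor of the ground truth (only strictly higher
    scores push it down the ranking).
    """
    return 1 + sum(1 for s in scores if s > scores[truth])


def evaluate(
    all_scores: Sequence[Sequence[float]],
    truths: Sequence[int],
    k: int = 2,
) -> Dict[str, float]:
    """Compute accuracy (rank 1), MRR, and hit rate within the top k."""
    ranks = [rank_of_truth(s, t) for s, t in zip(all_scores, truths)]
    n = len(ranks)
    return {
        "accuracy": sum(r == 1 for r in ranks) / n,
        "mrr": sum(1.0 / r for r in ranks) / n,
        "hit_at_k": sum(r <= k for r in ranks) / n,
    }


# Toy data (hypothetical): one score per token position in each sentence.
toy_scores = [
    [0.1, 0.7, 0.2],       # ground truth at index 1, ranked 1st
    [0.5, 0.3, 0.2],       # ground truth at index 1, ranked 2nd
    [0.4, 0.3, 0.2, 0.1],  # ground truth at index 3, ranked 4th
]
toy_truths = [1, 1, 3]
metrics = evaluate(toy_scores, toy_truths, k=2)
```

With this definition, accuracy counts only sentences where the ground-truth position is ranked first, while MRR also gives partial credit to near misses, which is why the reported MRR is higher than the reported accuracy.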
Related Resources
Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis
Grammatical Error Diagnosis for Chinese has always been a challenge for both foreign learners and NLP researchers because of the variety of its grammar and the flexibility of expression. In this paper, we present a model based on Bidirectional Long Short-Term Memory (Bi-LSTM) neural networks, which treats the task as a sequence labeling problem, so as to detect Chinese grammatical errors, to identify t...
Chinese Word Ordering Errors Detection and Correction for Non-Native Chinese Language Learners
Word Ordering Errors (WOEs) are the most frequent type of sentence-level grammatical error for non-native Chinese language learners. Learners taking Chinese as a foreign language often place character(s) in the wrong places in sentences, which results in wrong word(s) or ungrammatical sentences. Besides, there are no clear word boundaries in Chinese sentences. That makes WOEs detection a...
Detecting Word Usage Errors in Chinese Sentences for Learning Chinese as a Foreign Language
Automated grammatical error detection, which helps users improve their writing, is an important application in NLP. Recently more and more people are learning Chinese, and an automated error detection system can be helpful for the learners. This paper proposes n-gram features, dependency count features, dependency bigram features, and single-character features to determine if a Chinese sentence...
Chinese Grammatical Error Diagnosis Using Single Word Embedding
Automatic grammatical error detection for Chinese has been a big challenge for NLP researchers. Due to the formal and strict grammar rules in Chinese, it is hard for foreign students to master Chinese. A computer-assisted learning tool which can automatically detect and correct Chinese grammatical errors is necessary for those foreign students. Some of the previous works have sought to identify...
Chinese-Speaking EFL Learners’ Performances of Processing English Consonant Clusters
Locke (1983), studying world languages, found that some have word-initial clusters, some word-final clusters, and others consonant clusters in both word-initial and word-final positions. Considering negative transfer, linguists would claim that native speakers of languages without consonant clusters can have difficulty phonologically manipulating target items with consonant clusters. Chinese differs...